MORC proteins regulate transcription factor binding by mediating chromatin compaction in active chromatin regions

Background
The microrchidia (MORC) proteins are a family of evolutionarily conserved GHKL-type ATPases involved in chromatin compaction and gene silencing. Arabidopsis MORC proteins function in the RNA-directed DNA methylation (RdDM) pathway, where they act as molecular tethers to ensure the efficient establishment of RdDM and de novo gene silencing. However, MORC proteins also have RdDM-independent functions, although the underlying mechanisms are unknown.
Results
In this study, we examine MORC binding regions where RdDM does not occur in order to shed light on the RdDM-independent functions of MORC proteins. We find that MORC proteins compact chromatin and reduce DNA accessibility to transcription factors, thereby repressing gene expression. We also find that MORC-mediated repression of gene expression is particularly important under conditions of stress. MORC-regulated transcription factors can in some cases regulate their own transcription, resulting in feedback loops.
Conclusions
Our findings provide insights into the molecular mechanisms of MORC-mediated chromatin compaction and transcription regulation.
Supplementary Information
The online version contains supplementary material available at 10.1186/s13059-023-02939-4.

The Arabidopsis genome encodes six MORC proteins: MORC1, 2, 4, 5, 6, and 7 (MORC3 being a pseudogene) [6]. These six proteins are functionally redundant and colocalize with sites of RNA-directed DNA methylation (RdDM) genome-wide [7], where they are critical for establishing efficient RdDM and de novo gene silencing [7]. MORC7, when tethered to DNA using an artificial zinc finger, can target RdDM to ectopic sites. MORC7 is also required for the silencing of newly integrated FWA transgenes [7]. MORC proteins also act downstream of DNA methylation to suppress gene expression and are involved in plant immunity, protecting plants against potential pathogens by interacting with plant resistance (R) proteins [8,9]. However, the molecular mechanisms underlying these RdDM-independent functions remain unknown. We previously observed MORC binding sites where RdDM does not occur (MORC-unique sites) [7], and by studying these sites, we aim to shed light on the mechanisms underlying the RdDM-independent functions of MORC proteins.
TOPLESS (TPL) and LEUNIG (LUG) are both Groucho (Gro)/TLE-type transcriptional co-repressors in plants. They are characterized by a conserved glutamine-rich N-terminal domain and a C-terminal WD-repeat domain [10]. The glutamine-rich domain participates in protein oligomerization, and the WD-repeat domain interacts with downstream transcriptional regulators [10]. The functional counterpart of the Gro/TLE family of proteins in yeast, Tup1, was originally identified as a co-repressor that occupied the binding sites of transcriptional activators [11,12]. However, evidence now shows that Tup1 can switch from a co-repressor to a co-activator in response to stress, and is required for the activation of certain genes related to the stress response [11,12].
Here, we use MORC-unique sites to study the RdDM-independent functions of MORC proteins. We show that MORC proteins compact chromatin and reduce DNA accessibility to TFs, thereby repressing the transcription of stress-responsive genes.

MORC proteins bind to active chromatin regions devoid of RdDM
We previously reported that approximately 80% of MORC7 binding regions overlap with sites of RdDM [7]. MORC7 is recruited to these sites by the RdDM machinery, where it then facilitates the efficiency of the RdDM pathway. However, the remaining 20% of MORC7 binding sites are devoid of RdDM, as evidenced by a lack of Pol V occupancy [7]. The mechanisms underlying the function of MORC7 within these RdDM-depleted regions remain unknown.
Mouse MORC3 recognizes and localizes to regions of H3K4me3-marked chromatin through its CW domain [13]; however, Arabidopsis MORCs do not contain CW domains. To determine whether Arabidopsis MORCs co-localize with specific chromatin features, we used the ChromHMM method to investigate correlations between MORC7 and several well-characterized chromatin features (H3K9ac, H3K27ac, H4K16ac, H3K4me1, H3K4me3, H3K36me2, H3K36me3, H3K9me2, H3K27me3, Pol II, and Pol V). We analyzed chromatin states using a method similar to that previously reported [14], but also included Pol V ChIP-seq data. We found 13 different chromatin states (Additional file 1: Fig. S1). MORC7 showed a strong correlation with Pol V (a known indicator of RdDM sites), which was consistent with our previous findings (State 11, Additional file 1: Fig. S1). Chromatin state 12 included sites enriched with MORC7 but depleted of Pol V, indicative of MORC7-unique regions. We did not observe enrichment of histone marks in these MORC7-unique regions (Additional file 1: Fig. S1).
Within these MORC7-unique regions, we identified two subgroups: MORC7A and MORC7B. ChIP-seq data for MORC7 and Pol V, together with ATAC-seq, transposable element (TE) density, and small RNA density, indicated that MORC7A regions had high chromatin accessibility and low TE density (Fig. 1a, b, Additional file 1: Fig. S2). Consistent with the ChromHMM analysis, MORC7A displayed low levels of histone occupancy and histone modification, although its flanking regions were enriched for active histone modifications. This suggests that MORC7A is located within an active chromatin compartment between genes (Fig. 1b).
MORC7B contained a high density of TEs with no apparent active histone marks, reflective of its heterochromatic localization (Fig. 1a, b). We found that MORC7A regions had low levels of DNA methylation, while MORC7B regions had high levels of methylation (Fig. 1c, d). These results indicate that MORC7 binds both active chromatin and deep heterochromatin at sites where RdDM does not occur, suggesting that it regulates gene expression at these sites through RdDM-independent mechanisms.

MORC7 preferentially binds to the promoters of TFs
The genomic distribution data showed enrichment of MORC7A peaks over promoters (Fig. 2a), but no enrichment of MORC7B peaks, consistent with their deep heterochromatic localization. The functional annotation of the genes proximal to MORC7A suggested that they were enriched in TF-encoding genes (Table 1). The Arabidopsis genome encodes approximately 1491 TF genes (5.5% of the genome) [15]. Of  Table S1). This enrichment was more significant than the enrichment for MORC7-Pol V common (p-value = 0.001) and Pol V-unique (p-value = 1.6E − 5) regions. Gene Ontology (GO) term analysis of genes proximal to MORC7A showed an enrichment of negative regulation of auxin metabolic process (~ 80-fold), shade avoidance (~ 30-fold), and the primary shoot apical meristem specification pathway (~ 30-fold) (Fig. 2b, c, d). The primary shoot apical meristem specification pathway (GO:0010072) is responsible for the growth of all post-embryonic, above-ground plant structures [16]. In Arabidopsis, this pathway includes several TOPLESS-related genes [16]. Interestingly, we found that MORC7 specifically bound to 12 of the 19 genes in this pathway (Fig. 2e, Additional file 1: Fig. S3), and co-localized with Pol V at an additional three. We show examples of MORC7 enrichment over the promoter regions for the four TOPLESS genes in Additional file 1: Fig. S3.

MORC7 closely co-localizes with some TFs
To investigate the protein interaction network of MORC7 with chromatin, we re-analyzed previously published crosslinked IP-MS data of MORC7 [7]. We identified 494 proteins (FDR < 0.05, FC > 2) that interacted with MORC7 (Fig. 3a), and found that many of these were involved in either chromatin-related pathways or development (Fig. 3b). We also identified 68 TFs among the MORC7-interacting proteins (68/494, p = 7.89E − 12) (Additional file 3: Table S2). To further test whether MORC7 co-localizes with TFs, we obtained binding site information for 200 TFs from the DNA Affinity Purification and sequencing (DAP-seq) database [17], and performed pairwise peak overlap analysis with MORC7 peaks. We found that MORC7A showed stronger co-localization with TFs compared to MORC7B, MORC7-Pol V common, and Pol V-unique regions, and also showed strong co-localization with some TF binding sites but not others (Fig. 3c). Of the TFs characterized with DAP-seq, 23 were pulled down in the MORC7 crosslinked IP-MS data (Additional file 3: Table S2). This indicates that MORC7A peaks are associated with TF binding sites. We also re-analyzed three transcription factors in particular, PIF4 [18], ARF6 [19], and TPR1 [20], because published ChIP-seq data were available. Metaplot analysis of these ChIP-seq data indicated that, among the MORC7A-unique, MORC7B-unique, MORC7-Pol V common, and Pol V-unique regions, these TFs showed the strongest co-localization with MORC7A, whereas the random promoter controls showed no obvious enrichment (Fig. 3d, Additional file 1: Fig. S4).
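The over-representation of TFs among the MORC7-interacting proteins can be illustrated with a one-sided hypergeometric test. This is only a sketch: the text does not state which test produced the reported p-value, and the background of 27,109 genes is an assumption derived from the statement that 1491 TF genes make up about 5.5% of the genome. The function name `hypergeom_sf` is hypothetical, and the result is not expected to reproduce the published value exactly.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """Return P(X >= k) for a hypergeometric draw without replacement.

    population: total number of genes (assumed background)
    successes:  number of TF genes in the background
    draws:      number of MORC7-interacting proteins
    k:          number of TFs observed among the interactors
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total


# Assumed background size: 1491 TF genes / 0.055 ~= 27109 genes (not from the paper).
p_value = hypergeom_sf(k=68, population=27109, successes=1491, draws=494)
print(f"one-sided enrichment p-value: {p_value:.3g}")
```

Under these assumptions about 27 TFs would be expected by chance among 494 proteins, so observing 68 gives a very small tail probability.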

MORC7 influences TF binding through chromatin compaction
To understand how MORC7 affects chromatin conformation, we performed an Assay for Transposase-Accessible Chromatin with high-throughput sequencing (ATAC-seq) in morc4 morc7, morc6, and morc hextuple (morchex, in which all functional MORCs are knocked out) mutants [6]. We plotted ATAC-seq data across the four groups and found that MORC7A, MORC7B, and MORC7-Pol V common regions showed greater chromatin accessibility in the mutants, particularly over the MORC7A regions (Fig. 4a, b). This phenotype was consistently observed in morc4 morc7, morc6, and morchex, with morchex showing the most pronounced phenotype (Fig. 4a, Additional file 1: Fig. S5). Interestingly, for Pol V-unique sites, DNA compaction was not reduced, but instead slightly increased (Fig. 4a). Consistently, we also observed an increase in DNA methylation for Pol V-unique sites in the mutants (Fig. 1d). This suggests that Pol V may be redistributed from the MORC7-Pol V common sites to Pol V-unique sites in the absence of MORC proteins. This is consistent with our previous findings that suggested MORC proteins function as molecular tethers to facilitate the recruitment of RdDM components [7].
To examine whether MORC-mediated DNA compaction affects TFs, we analyzed the ATAC-seq data for TF footprints. When a TF binds to DNA, it blocks integration by the Tn5 transposase at the bound site, causing the binding motif to exhibit lower DNA accessibility, and the flanking regions to exhibit higher DNA accessibility [21]. The footprints of 572 TFs downloaded from JASPAR were analyzed in the morc4 morc7, morc6, and morchex mutants [22]. Many TFs showed substantially stronger apparent binding within the MORC7A regions in the mutants. There were some increases in binding within the MORC7B regions (although to a lesser degree than in MORC7A regions) (Fig. 4c, d), while TF binding over RdDM sites was largely unaffected (Fig. 4e, f). The metaplot of ATAC-seq signals over the TF binding sites for the MORC7A regions confirmed that these TFs have stronger apparent binding in morc4 morc7, morc6, and morchex mutants, with morchex showing the strongest binding changes, and the random control regions showing no differences (Fig. 4g-i).
We previously showed that targeting either MORC7 or MORC6 ectopically in the fwa-4 epiallele background using ZF108 can trigger the silencing of FWA [7,23]. In addition to the FWA locus, ZF108 can also bind thousands of off-target sites [23]. These off-target sites are preferentially localized to promoter regions and therefore provide an excellent opportunity to test whether the presence of MORC proteins can affect TF binding. We compared TF footprints between ZF-MORC6 and fwa-4 and found a substantial decrease for many of the TF footprints in ZF-MORC6 plants (Fig. 4j). We further divided ZF off-target sites into sites that gained DNA methylation (n = 2186) and sites that did not (n = 8580), and observed a substantial decrease of the TF footprints over both groups of sites in ZF-MORC6 plants (Additional file 1: Fig. S6). This supports the hypothesis that MORC proteins affect TF binding. Together, these results suggest that MORCs inhibit TF binding by altering chromatin accessibility.
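The footprint principle described above, a dip in accessibility at the motif flanked by higher signal, can be illustrated with a toy score. This is a minimal sketch, not the footprinting tool used in the study (which is not named here). The function `footprint_depth`, the flank width, and the example signals are all hypothetical.

```python
def footprint_depth(signal, motif_start, motif_end, flank=3):
    """Mean flank signal minus mean motif signal.

    signal: per-base Tn5 insertion counts (toy values)
    motif_start, motif_end: half-open motif interval within `signal`
    A larger value suggests a deeper footprint, i.e. stronger apparent binding.
    """
    left = signal[max(0, motif_start - flank):motif_start]
    right = signal[motif_end:motif_end + flank]
    motif = signal[motif_start:motif_end]
    flanks = left + right
    if not flanks or not motif:
        raise ValueError("motif and flanks must both be non-empty")
    return sum(flanks) / len(flanks) - sum(motif) / len(motif)


# Invented per-base counts around a 3-bp motif at positions 3..5.
wild_type = [5, 6, 5, 4, 4, 4, 6, 5, 6]
mutant = [9, 10, 9, 2, 2, 2, 10, 9, 10]

print(footprint_depth(wild_type, 3, 6))  # shallow footprint
print(footprint_depth(mutant, 3, 6))     # deeper footprint
```

In this toy case the mutant profile has higher flanking signal and a lower motif signal, so its score is larger, which is the pattern interpreted as stronger apparent TF binding.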

MORC influences gene expression downstream of the TFs
To understand whether MORC proteins regulate gene expression, we performed RNA-seq with the morchex mutant. As MORC7A co-localizes strongly with PIF4 (Fig. 3d), a central regulator in temperature signaling [24], we applied heat treatment to the morchex mutant. We first compared the expression of genes proximal to MORC7A peaks in wild type (WT) and the morchex mutant without treatment. This showed that the genes proximal to MORC7A were slightly (p = 0.002) upregulated in the morchex mutant without treatment (Fig. 5a), including the TFs SEP3, PIF4, ARF6/8, TPR1, LUG, and SEU (Fig. 5b). After heat treatment, morchex displayed a stronger response compared to the WT, with significantly more upregulated genes (Fig. 5c, d). Genes proximal to MORC7A were enriched in shoot apical meristem specification pathways, and consistently, we observed stronger upregulation of these genes in morchex after heat treatment (Additional file 1: Fig. S7).
To confirm that MORC proteins affect TF binding, and to understand how they affect downstream gene expression, we selected two TFs, TOPLESS (TPL) and LEUNIG (LUG), for ChIP-seq analysis, because they were present in the MORC7 IP-MS data [7]. We expressed TPL and LUG fused with a 3XFLAG-tag in both WT and morchex. Consistent with the TF footprint analysis, both TPL and LUG displayed stronger binding at MORC7A regions in morchex, while only a slight increase in binding was noted for the MORC7-Pol V co-binding sites (Fig. 5e, f), consistent with an increase in chromatin accessibility in morchex (Fig. 5g). To test whether the binding strength of TPL and LUG might be higher in morchex, we ranked TPL and LUG binding sites based on the MORC ChIP-seq signals and divided them into three groups: high, middle, and low (Fig. 5g). Overall, in the morchex mutant, we observed increased TPL and LUG binding, as well as increased chromatin accessibility across the regions with stronger MORC7 signals (Fig. 5g). We found that MORC7A-bound genes were downregulated (p = 5.7E − 13) in the lug mutant, suggesting that LUG may facilitate expression of these genes (Fig. 5a). Using ChIP-seq data together with RNA-seq data in the lug mutant, we identified 95 genes that appeared to be directly regulated by LUG (Additional file 4: Table S3). We found that these LUG-regulated genes were upregulated (p = 0.77) in morchex, particularly after heat treatment (p = 7.5E − 9, Fig. 5h). Motif enrichment analysis indicated that the most overrepresented motifs were largely similar between wild type and the morchex mutant (Additional file 1: Fig. S8). However, MACS2 peak calling identified some peaks in morchex mutants that were not called as peaks in wild type. A metaplot of ChIP-seq signal over these regions showed a higher signal for TPL and LUG in morchex as compared to wild type (Additional file 1: Fig. S9).
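Ranking binding sites by MORC ChIP-seq signal and dividing them into high, middle, and low groups can be sketched as follows. The text does not give the cutoffs, so equal-sized groups are an assumption here. The function `split_by_morc_signal`, the site names, and the signal values are hypothetical.

```python
def split_by_morc_signal(sites):
    """Split (site_name, morc_signal) pairs into three rank-based groups.

    Assumption: groups are equal-sized terciles of the ranked list;
    the actual grouping used in the study is not specified.
    """
    ranked = sorted(sites, key=lambda site: site[1], reverse=True)
    n = len(ranked)
    first_cut, second_cut = n // 3, (2 * n) // 3
    return {
        "high": [name for name, _ in ranked[:first_cut]],
        "middle": [name for name, _ in ranked[first_cut:second_cut]],
        "low": [name for name, _ in ranked[second_cut:]],
    }


# Invented example sites with arbitrary MORC ChIP-seq signal values.
example_sites = [
    ("site_a", 0.4), ("site_b", 9.1), ("site_c", 2.2),
    ("site_d", 5.0), ("site_e", 0.1), ("site_f", 7.3),
]
groups = split_by_morc_signal(example_sites)
print(groups)
```

TF ChIP-seq and ATAC-seq signal could then be summarized separately for each group to compare WT and mutant.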

Discussion
We previously reported that MORC proteins are localized to sites of RdDM throughout the genome, and function as molecular tethers to facilitate the efficient establishment of RdDM [7]. We showed that this RdDM-related function of MORC proteins is  [7,23]; however, this model does not explain other functions of MORC proteins. For example, MORC1 and MORC6 were shown to work downstream of DNA methylation to repress the expression of both the endogenous SDC gene and an SDC transgene, as well as other DNA-methylated targets in the genome [9]. In addition, morc mutants display various disease phenotypes; for example, Kang et al. [8] reported that morc1 is susceptible to Turnip Crinkle Virus (TCV), while Harris et al. [6] reported that morchex is susceptible to the Hyaloperonospora arabidopsidis (Hpa) strain, Emwa1. However, the molecular mechanisms underlying the additional functions of the MORC proteins remain unknown.
Here, we investigated the function of MORC7 in regions where RdDM does not occur, particularly those near genes where no DNA methylation is present. We found that MORC proteins reduce chromatin accessibility within these regions. Previous in vitro studies showed that C. elegans MORC1 homodimers can topologically entrap and condense DNA through further oligomerization of MORC1 proteins [2]. In addition, Arabidopsis morc mutants display pericentromeric heterochromatin decondensation [9], which takes place with minimal losses of DNA methylation throughout the genome. This indicates that MORC proteins contribute to chromatin compaction independently of DNA methylation [9]. We show here that MORC proteins reduce chromatin accessibility in methylation-free promoter regions of DNA, which may explain their mechanism for methylation-independent gene regulation. We suggest that Arabidopsis MORCs may use a mechanism of chromatin compaction similar to that of C. elegans MORC1, compacting chromatin by topological entrapment and thereby reducing its accessibility to TFs.
Plant MORCs have been implicated in plant pathogen responses. MORCs promote resistance in some plant species and inhibit defense responses in others [8]. Upregulation of protein-coding genes was previously shown in morc4 morc7, although the underlying mechanism was unknown [6]. Here, we report that MORC proteins regulate gene expression by compacting chromatin in promoter regions, thereby preventing access by TFs. In addition, MORCs preferentially bound to the promoter regions of TF genes, contributing to their regulation, and our crosslinked IP-MS data suggested that MORCs are in close proximity to many TFs. A previous study suggested that the proteins SUVH2 and SUVH9 bind to methylated DNA and recruit MORC proteins to RdDM loci to facilitate the efficiency of RdDM and gene silencing [25]. However, the question of how MORC proteins are recruited to regions devoid of RdDM remains to be answered. Interestingly, we observed that many of the TFs interacting with MORC7 bind to their own promoters, suggesting regulation by a feedback loop, which may amplify the effects of MORCs on transcriptional networks.
Finally, we showed that MORC proteins are important for the regulation of gene expression, particularly under stress conditions. We also found altered expression of heat-responsive genes in morchex. As with their role in the plant pathogen defense response, it seems likely that the role of MORCs in stress responses relates to their effects on chromatin compaction of promoter regions and on TF networks.

Conclusions
MORC proteins have a broad binding spectrum in the genome and appear to participate in at least three separate processes. They co-localize to sites of RdDM, facilitating efficient DNA methylation establishment [7]; they are needed to repress DNA-methylated areas of pericentromeric heterochromatin in a DNA methylation-independent manner [9]; and they co-localize with TFs in unmethylated promoter regions, regulating TF binding and gene expression by altering chromatin accessibility (Fig. 5i). Although it seems likely that MORCs act in each of these processes by topologically entrapping DNA, there are likely mechanistic differences that can explain the localization and function of MORCs in these three different epigenetic environments in the genome.

Epitope-tagged transgenic lines
Full-length genomic DNA fragments, including native promoter sequences, were cloned into pENTR/D vectors (Invitrogen), and then into modified destination vectors carrying 3xFLAG with LR Clonase (Invitrogen). All primers used in this study are available in Additional file 5: Table S4.

Nuclei extraction and ATAC-seq library preparation
Nuclei were collected from inflorescence and meristem tissues in accordance with previously described methods [26,27]. Freshly isolated nuclei were used for ATAC-seq, as described elsewhere [28]. Inflorescence tissues were collected for extraction of nuclei as follows: approximately 5 g of inflorescence tissue was collected and immediately transferred into ice-cold grinding buffer (300 mM sucrose, 20 mM Tris pH 8, 5 mM MgCl2, 5 mM KCl, 0.2% Triton X-100, 5 mM β-mercaptoethanol, and 35% glycerol); the samples were then ground with an Omni International General Laboratory Homogenizer at 4 °C, and filtered through two layers of Miracloth and a 40-µm nylon mesh Cell Strainer (Fisher). Samples were centrifuged for 10 min at 3000 g, the supernatant was discarded, and the pellet was resuspended with 25 ml of grinding buffer using a Dounce homogenizer. The wash step was performed twice in total. Nuclei were then resuspended in 0.5 ml of freezing buffer (50 mM Tris pH 8, 5 mM MgCl2, 20% glycerol, and 5 mM β-mercaptoethanol). Nuclei were then subjected to a transposition reaction with Tn5 (Illumina). For the transposition reaction, 25 µl of 2 × DMF (66 mM Tris-acetate pH 7.8, 132 mM K-Acetate, 20 mM Mg-Acetate, and 32% DMF) was mixed with 2.5 µl Tn5 and 22.5 µl nuclei suspension and incubated at 37 °C for 30 min. The transposed DNA fragments were then purified with the ChIP DNA Clean & Concentrator Kit (Zymo). Libraries were prepared with Phusion High-Fidelity DNA Polymerase (NEB), in a system containing: 12.5 µl 2 × Phusion, 1.25 µl 10 mM Ad1 primer, 1.25 µl 10 mM Ad2 primer, 4 µl ddH2O, and 6 µl purified transposed DNA fragments. The ATAC-seq libraries were sequenced on a NovaSeq 6000 sequencer (Illumina).
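As a quick arithmetic check of the reaction mixes listed above, the component volumes can be summed and the final buffer concentrations computed by simple dilution. The volumes and 2 × buffer concentrations are taken from the protocol text; the helper names `total_volume` and `final_concentration` are hypothetical.

```python
# Component volumes in microliters, as listed in the protocol text.
transposition_mix = {"2x DMF buffer": 25.0, "Tn5": 2.5, "nuclei suspension": 22.5}
pcr_mix = {
    "2x Phusion": 12.5,
    "Ad1 primer": 1.25,
    "Ad2 primer": 1.25,
    "ddH2O": 4.0,
    "transposed DNA": 6.0,
}

# Concentrations of the 2x DMF buffer components (mM, or % for DMF).
dmf_2x = {"Tris-acetate": 66.0, "K-acetate": 132.0, "Mg-acetate": 20.0, "DMF (%)": 32.0}


def total_volume(mix):
    """Sum of all component volumes in a reaction mix."""
    return sum(mix.values())


def final_concentration(stock, added_volume, reaction_volume):
    """Simple dilution: C_final = C_stock * V_added / V_total."""
    return stock * added_volume / reaction_volume


reaction_volume = total_volume(transposition_mix)
final_dmf = {
    name: final_concentration(conc, transposition_mix["2x DMF buffer"], reaction_volume)
    for name, conc in dmf_2x.items()
}
print(reaction_volume, total_volume(pcr_mix), final_dmf)
```

The transposition reaction totals 50 µl, so the 2 × buffer is diluted to 1 × (33 mM Tris-acetate, 66 mM K-acetate, 10 mM Mg-acetate, 16% DMF), and the PCR mix totals 25 µl, so the 2 × Phusion mix is likewise at 1 ×.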

RNA-seq library preparation
Total RNA was extracted from ~ 100 mg of flower buds using TRIzol and the Direct-zol RNA Miniprep kit (Zymo, R2050). Sequencing libraries were prepared using the TruSeq Stranded mRNA Library Prep kit (Illumina), according to the manufacturer's instructions, and sequenced on a NovaSeq 6000 sequencer (Illumina).

ChIP-seq library preparation
10 g of inflorescence and meristem tissues were used for ChIP-seq. ChIP assays were performed as described previously [29]. Briefly, 2-4 g of flower tissue was collected from 4- to 5-week-old plants, and ground in liquid nitrogen. Nuclei isolation buffer containing 1% formaldehyde was used to fix the chromatin for ten minutes. Freshly prepared glycine was then used to terminate the crosslinking reaction. Shearing was performed using a Bioruptor Plus (Diagenode), and immunoprecipitations with antibodies were performed overnight at 4 °C. The anti-FLAG M2 (Sigma) antibody was used in this study. Magnetic Protein A and Protein G Dynabeads (Invitrogen) were added and incubated at 4 °C for 2 h. Crosslinks were reversed overnight at 65 °C. The protein-DNA mix was then treated with Proteinase K (Invitrogen) at 45 °C for 4 h. The DNA was purified and precipitated with 3 M sodium acetate (Invitrogen), GlycoBlue (Invitrogen), and ethanol overnight at − 20 °C. The precipitated DNA was then used for library preparation with the Ovation Ultra Low System V2 kit (NuGEN), and libraries were sequenced using an Illumina NovaSeq 6000 sequencer.

Small RNA-seq analysis
Small RNA-seq reads were downloaded from a previous paper [30]. The adaptor sequence (TGGAATTCTCGG) was trimmed with trim_galore, and trimmed reads were mapped to the reference genome TAIR10 using Bowtie2 with only one unique hit and zero mismatches [31].

RNA-seq analysis
Cleaned short reads were aligned to the reference genome, TAIR10, by Bowtie2 (v2.1.0) [31]. Expression abundance was then calculated by RSEM using the default parameters [36]. Heatmaps were visualized using the R package pheatmap. Differential expression analysis was conducted using edgeR [34]. Genes with a p-value < 0.05 and a fold change > 2 were considered significantly differentially expressed between samples.
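The significance threshold above can be expressed as a simple filter applied to a table of per-gene results. This is a minimal sketch and does not reproduce the edgeR analysis. It assumes that "fold change > 2" applies in both directions (that is, |log2 fold change| > 1), which the text does not state explicitly. The function name `is_differentially_expressed` is hypothetical.

```python
import math


def is_differentially_expressed(p_value, fold_change, p_cutoff=0.05, fc_cutoff=2.0):
    """Apply the p-value and fold-change thresholds to one gene.

    fold_change is the mutant/WT expression ratio (must be positive).
    Assumption: the fold-change cutoff is symmetric, so a ratio of 0.25
    (4-fold down) passes just as a ratio of 4 (4-fold up) does.
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be a positive ratio")
    return p_value < p_cutoff and abs(math.log2(fold_change)) > math.log2(fc_cutoff)


print(is_differentially_expressed(0.01, 3.0))   # up-regulated, passes
print(is_differentially_expressed(0.01, 0.25))  # down-regulated, passes
print(is_differentially_expressed(0.01, 1.5))   # fold change too small
print(is_differentially_expressed(0.20, 4.0))   # p-value too large
```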

Whole-genome bisulfite sequencing (BS-seq) analysis
Previously published whole-genome bisulfite sequencing data for morc mutants and wild type were reanalyzed [6]. Briefly, Trim_galore (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) was used to trim adapters. BS-seq reads were aligned to the TAIR10 reference genome by BSMAP (v2.90), allowing two mismatches and one best hit (-v 2 -w 1) [41]. Reads with three or more consecutive methylated CHH sites were considered to be unconverted reads and were filtered out. DNA methylation levels were defined as #C/(#C + #T).
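The methylation-level definition and the unconverted-read filter can be sketched as below. This is an illustration rather than the pipeline used in the study. It assumes the filter counts consecutive unconverted (C) calls at CHH positions along a read, and the per-read call string is a hypothetical simplified representation. The function names are also hypothetical.

```python
def methylation_level(c_count, t_count):
    """DNA methylation level at a cytosine: #C / (#C + #T).

    Returns None when the site has no coverage.
    """
    total = c_count + t_count
    if total == 0:
        return None
    return c_count / total


def looks_unconverted(chh_calls, run_length=3):
    """Flag a read with `run_length` or more consecutive unconverted CHH calls.

    chh_calls: string with one character per CHH site along the read,
    'C' for an unconverted call and 'T' for a converted call
    (a simplified, assumed representation).
    """
    run = 0
    for call in chh_calls:
        run = run + 1 if call == "C" else 0
        if run >= run_length:
            return True
    return False


print(methylation_level(3, 1))     # 3 C reads and 1 T read at a site
print(looks_unconverted("TCCCT"))  # three consecutive C calls: filtered out
print(looks_unconverted("CCTCC"))  # no run of three: kept
```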